Rank | Count | Beginning |
---|---|---|
2245 | 825 | El |
5421 | 698 | La |
3545 | 446 | En |
3011 | 347 | Els |
15 | 322 | A |
4942 | 286 | I |
7624 | 246 | Per |
4148 | 231 | És |
7103 | 202 | No |
7815 | 175 | Però |
6375 | 166 | Les |
9473 | 144 | Un |
9076 | 123 | També |
9475 | 117 | Una |
8572 | 115 | Segons |
1811 | 105 | De |
628 | 98 | Amb |
8815 | 87 | Si |
411 | 83 | Al |
920 | 81 | Aquest |
9310 | 75 | Tot |
1504 | 74 | Com |
907 | 69 | Aquesta |
9771 | 62 | Va |
1995 | 61 | Després |
286 | 59 | Això |
200 | 53 | Així |
4830 | 52 | Hi |
8261 | 51 | Quan |
1116 | 46 | Ara |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
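The same count can be reproduced outside the database and generalized to N = 2, 3, 4. The following is a minimal Python sketch, not the query actually used for the table: it assumes the sentences are available as an in-memory list of strings, and the function name `sentence_beginnings` and its parameters are hypothetical. Splitting on a single space mirrors what `substring_index(sentence, ' ', N)` returns, including the case where a sentence has fewer than N words.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: `sentences` is assumed to be an iterable of strings.
    """
    counts = Counter()
    for sentence in sentences:
        # Keep everything before the n-th space, like
        # substring_index(sentence, ' ', n); a sentence with fewer
        # than n spaces is kept whole.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by count, descending, like "order by cnt desc limit 50".
    return counts.most_common(limit)


if __name__ == "__main__":
    # Tiny made-up sample, for illustration only.
    sample = [
        "El gat dorm.",
        "El gos corre.",
        "La casa és gran.",
        "El cel és blau.",
    ]
    for beginning, count in sentence_beginnings(sample, n=1):
        print(count, beginning)
```

Calling the function with `n=2`, `n=3`, or `n=4` yields the longer beginnings covered in the following subsections.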